我们从控制理论限制的角度研究随机策略梯度方法。我们的主要结果是,在Doyle的意义上,不可避免的线性系统不可避免地导致嘈杂的梯度估计。我们还举例说明了一类稳定系统的示例,其中政策梯度方法遭受了维度的诅咒。我们的结果适用于状态反馈和部分观察到的系统。
translated by 谷歌翻译
本文介绍了局部最低限度的遗憾,用于自适应控制线性 - 四爵士(LQG)系统的下限。我们考虑平滑参数化实例,并在对数遗憾时提供了对实例的特定和灵活性,以考虑到问题结构。这种理解依赖于两个关键概念:局部无规格的概念;当最佳策略没有提供足够的激励以确定最佳政策,并产生退化的Fisher信息矩阵;以及信息遗憾的界限,当政策依赖信息矩阵的小特征值在该政策的遗憾方面是无限的。结合减少贝叶斯估计和范树的应用,这两个条件足以证明遗憾的界限为时间$ \ sqrt {t} $ \ sqrt {t} $ of the the theaign,$ t $。该方法产生低界,其具有与控制理论问题常数自然的紧密依赖性和规模。例如,我们能够证明在边缘稳定性附近运行的系统从根本上难以学习控制。我们进一步表明,大类系统满足这些条件,其中任何具有$ a $的状态反馈系统 - 和$ b $ -matrices未知。最重要的是,我们还建立了一个非活动类别的部分可观察系统,基本上是那些过度启动的那些满足这些条件,从而提供$ \ SQRT {T} $下限对部分可观察系统也有效。最后,我们转到两个简单的例子,表明我们的下限捕获了经典控制 - 理论直觉:我们的下限用于在边际稳定性附近或大过滤器增益的近方行,这些系统可以任意难以努力(学习到)控制。
translated by 谷歌翻译
Large, labeled datasets have driven deep learning methods to achieve expert-level performance on a variety of medical imaging tasks. We present CheXpert, a large dataset that contains 224,316 chest radiographs of 65,240 patients. We design a labeler to automatically detect the presence of 14 observations in radiology reports, capturing uncertainties inherent in radiograph interpretation. We investigate different approaches to using the uncertainty labels for training convolutional neural networks that output the probability of these observations given the available frontal and lateral radiographs. On a validation set of 200 chest radiographic studies which were manually annotated by 3 board-certified radiologists, we find that different uncertainty approaches are useful for different pathologies. We then evaluate our best model on a test set composed of 500 chest radiographic studies annotated by a consensus of 5 board-certified radiologists, and compare the performance of our model to that of 3 additional radiologists in the detection of 5 selected pathologies. On Cardiomegaly, Edema, and Pleural Effusion, the model ROC and PR curves lie above all 3 radiologist operating points. We release the dataset to the public as a standard benchmark to evaluate performance of chest radiograph interpretation models. 1
translated by 谷歌翻译
Riemannian geometry provides powerful tools to explore the latent space of generative models while preserving the inherent structure of the data manifold. Lengths, energies and volume measures can be derived from a pullback metric, defined through the immersion that maps the latent space to the data space. With this in mind, most generative models are stochastic, and so is the pullback metric. Manipulating stochastic objects is strenuous in practice. In order to perform operations such as interpolations, or measuring the distance between data points, we need a deterministic approximation of the pullback metric. In this work, we are defining a new metric as the expected length derived from the stochastic pullback metric. We show this metric is Finslerian, and we compare it with the expected pullback metric. In high dimensions, we show that the metrics converge to each other at a rate of $\mathcal{O}\left(\frac{1}{D}\right)$.
translated by 谷歌翻译
Insects as pollinators play a key role in ecosystem management and world food production. However, insect populations are declining, calling for a necessary global demand of insect monitoring. Existing methods analyze video or time-lapse images of insects in nature, but the analysis is challenging since insects are small objects in complex and dynamic scenes of natural vegetation. The current paper provides a dataset of primary honeybees visiting three different plant species during two months of summer-period. The dataset consists of more than 700,000 time-lapse images from multiple cameras, including more than 100,000 annotated images. The paper presents a new method pipeline for detecting insects in time-lapse RGB-images. The pipeline consists of a two-step process. Firstly, the time-lapse RGB-images are preprocessed to enhance insects in the images. We propose a new prepossessing enhancement method: Motion-Informed-enhancement. The technique uses motion and colors to enhance insects in images. The enhanced images are subsequently fed into a Convolutional Neural network (CNN) object detector. Motion-Informed-enhancement improves the deep learning object detectors You Only Look Once (YOLO) and Faster Region-based Convolutional Neural Networks (Faster R-CNN). Using Motion-Informed-enhancement the YOLO-detector improves average micro F1-score from 0.49 to 0.71, and the Faster R-CNN-detector improves average micro F1-score from 0.32 to 0.56 on the our dataset. Our datasets are published on: https://vision.eng.au.dk/mie/
translated by 谷歌翻译
In today's uncertain and competitive market, where enterprises are subjected to increasingly shortened product life-cycles and frequent volume changes, reconfigurable manufacturing systems (RMS) applications play a significant role in the manufacturing industry's success. Despite the advantages offered by RMS, achieving a high-efficiency degree constitutes a challenging task for stakeholders and decision-makers when they face the trade-off decisions inherent in these complex systems. This study addresses work tasks and resource allocations to workstations together with buffer capacity allocation in RMS. The aim is to simultaneously maximize throughput and minimize total buffer capacity under fluctuating production volumes and capacity changes while considering the stochastic behavior of the system. An enhanced simulation-based multi-objective optimization (SMO) approach with customized simulation and optimization components is proposed to address the abovementioned challenges. Apart from presenting the optimal solutions subject to volume and capacity changes, the proposed approach support decision-makers with discovered knowledge to further understand the RMS design. In particular, this study presents a problem-specific customized SMO combined with a novel flexible pattern mining method for optimizing RMS and conducting post-optimal analyzes. To this extent, this study demonstrates the benefits of applying SMO and knowledge discovery methods for fast decision-support and production planning of RMS.
translated by 谷歌翻译
Adaptation-relevant predictions of climate change are often derived by combining climate models in a multi-model ensemble. Model evaluation methods used in performance-based ensemble weighting schemes have limitations in the context of high-impact extreme events. We introduce a locally time-invariant model evaluation method with focus on assessing the simulation of extremes. We explore the behaviour of the proposed method in predicting extreme heat days in Nairobi.
translated by 谷歌翻译
This paper presents an accurate, highly efficient, and learning-free method for large-scale odometry estimation using spinning radar, empirically found to generalize well across very diverse environments -- outdoors, from urban to woodland, and indoors in warehouses and mines - without changing parameters. Our method integrates motion compensation within a sweep with one-to-many scan registration that minimizes distances between nearby oriented surface points and mitigates outliers with a robust loss function. Extending our previous approach CFEAR, we present an in-depth investigation on a wider range of data sets, quantifying the importance of filtering, resolution, registration cost and loss functions, keyframe history, and motion compensation. We present a new solving strategy and configuration that overcomes previous issues with sparsity and bias, and improves our state-of-the-art by 38%, thus, surprisingly, outperforming radar SLAM and approaching lidar SLAM. The most accurate configuration achieves 1.09% error at 5Hz on the Oxford benchmark, and the fastest achieves 1.79% error at 160Hz.
translated by 谷歌翻译
Biological cortical networks are potentially fully recurrent networks without any distinct output layer, where recognition may instead rely on the distribution of activity across its neurons. Because such biological networks can have rich dynamics, they are well-designed to cope with dynamical interactions of the types that occur in nature, while traditional machine learning networks may struggle to make sense of such data. Here we connected a simple model neuronal network (based on the 'linear summation neuron model' featuring biologically realistic dynamics (LSM), consisting of 10 of excitatory and 10 inhibitory neurons, randomly connected) to a robot finger with multiple types of force sensors when interacting with materials of different levels of compliance. Scope: to explore the performance of the network on classification accuracy. Therefore, we compared the performance of the network output with principal component analysis of statistical features of the sensory data as well as its mechanical properties. Remarkably, even though the LSM was a very small and untrained network, and merely designed to provide rich internal network dynamics while the neuron model itself was highly simplified, we found that the LSM outperformed these other statistical approaches in terms of accuracy.
translated by 谷歌翻译
Recent 3D-based manipulation methods either directly predict the grasp pose using 3D neural networks, or solve the grasp pose using similar objects retrieved from shape databases. However, the former faces generalizability challenges when testing with new robot arms or unseen objects; and the latter assumes that similar objects exist in the databases. We hypothesize that recent 3D modeling methods provides a path towards building digital replica of the evaluation scene that affords physical simulation and supports robust manipulation algorithm learning. We propose to reconstruct high-quality meshes from real-world point clouds using state-of-the-art neural surface reconstruction method (the Real2Sim step). Because most simulators take meshes for fast simulation, the reconstructed meshes enable grasp pose labels generation without human efforts. The generated labels can train grasp network that performs robustly in the real evaluation scene (the Sim2Real step). In synthetic and real experiments, we show that the Real2Sim2Real pipeline performs better than baseline grasp networks trained with a large dataset and a grasp sampling method with retrieval-based reconstruction. The benefit of the Real2Sim2Real pipeline comes from 1) decoupling scene modeling and grasp sampling into sub-problems, and 2) both sub-problems can be solved with sufficiently high quality using recent 3D learning algorithms and mesh-based physical simulation techniques.
translated by 谷歌翻译